Evaluating the arrayed primer extension resequencing
assay of TP53 tumor suppressor gene

Neeme Tõnisson*†, Jana Zernant*, Ants Kurg†, Hendrik Pavel*, Georg Slavin*, Hanno Roomere*†, Aune Meiel*†,
Pierre Hainaut‡, and Andres Metspalu*†§

*Asper, Ltd., 3 Oru Street, 51014 Tartu, Estonia; †Institute of Molecular and Cell Biology, University of Tartu/Estonian Biocentre, 23 Riia Street, 51010 Tartu,
Estonia; and ‡International Agency for Research on Cancer, 150, Cours Albert Thomas, F-69372 Lyon Cedex 08, France

Communicated by C. Thomas Caskey, Cogene Biotech Ventures, Ltd., Houston, TX, February 20, 2002 (received for review July 9, 2001)



Identification of mutations in the tumor suppressor gene TP53 has 
implications for the molecular epidemiology and for the molecular 
pathology of human cancer. We have developed and evaluated an 
arrayed primer extension assay for covering both strands of a 
region of the coding sequence containing more than 95% of the 
mutations described so far in TP53. On average, 97.5% of the
arrayed TP53 gene sequence can be analyzed from either sense or 
antisense strands, and 81% from both strands. A patient DNA 
sample is amplified and annealed to arrayed primers, which then 
promote DNA polymerase extension reactions with four fluores- 
cently labeled dideoxynucleotides. The TP53 gene chip spans exons 
2-9 plus two introns from both strands. The performance of the 
assay was evaluated by using freshly extracted genomic DNA, as 
well as DNA extracted from archival (paraffin-embedded) DNA 
samples. The arrayed primer extension-based TP53 gene test pro- 
vides an accurate and efficient tool for DNA sequence analysis 
of this frequently mutated gene for both research and clinical 
applications. 

APEX | oligonucleotide array | chip

The evidence is growing that specific mutations in the TP53 
gene can represent important factors for the prognosis of 
cancer and for the response to various types of cytotoxic therapy. 
Furthermore, patterns of TP53 mutations have differed consid- 
erably from one type of cancer to the other (1-4). However, 
screening for TP53 gene mutations has yet to become routine
in clinical or epidemiological practice, mainly because current
detection technologies are labor-intensive and have prohibitive 
costs for large-scale prospective studies. Another strong limita- 
tion to routine analysis of TP53 mutations resides in the fact that 
many tumors contain an excess of wild-type TP53 as compared 
with mutant, resulting from the presence of intact alleles in 
tumor as well as in noncancer cells (stroma, inflammatory cells, 
blood vessels). 

In this report we describe the development of an arrayed 
primer extension (APEX) assay for the rapid and sensitive 
detection and identification of mutations in the TP53 gene. 
APEX is a genotyping and resequencing technology that com- 
bines the advantages of Sanger dideoxy sequencing with the 
parallelization and high-throughput potential of the microarray
format. A DNA sample is amplified, fragmented enzymatically, 
and annealed to arrayed primers, promoting sites for template- 
dependent DNA polymerase extension reactions by using four 
fluorescently labeled dideoxynucleotides. Each base is probed 
with two primers, one for the sense and another for the antisense 
strand (5). The Genorama imaging system and genotyping software
(Asper Ltd., Tartu, Estonia, www.asperbio.com) were used for 
imaging and semiautomatic sequence analysis (Fig. 1). 

The principle of sequencing by primer extension on oligonu- 
cleotide array has been successfully applied for the systematic 
identification of all common TP53 mutations in human cancers. 
The TP53 microarray presented here spans exons 2-9 [contain- 
ing more than 98% of all mutations described so far in human 
cancer (6)], together with flanking splice sites and introns 5 and
8 from both strands (total of 1,218 bases; Fig. 2). This system has
been designed to allow the detection of most common mutations 
(missense, nonsense, tandem, insertions, deletions, and complex 
mutations) and all identified polymorphisms in the TP53 coding 
sequence. We found that this system allows for sequencing of an 
average of 97.5% of the arrayed TP53 gene from either sense or 
antisense strand, whereas 81% of the whole sequence was 
simultaneously analyzed from both strands. The length of this 
simultaneous DNA sequence readout (1.2 kb from both strands) 
outmatches the limits of the current standard for mutation 
detection, automated dideoxy sequencing. We describe perfor- 
mance of this assay, evaluated by using 100 normal genomic 
DNA samples from the Estonian population, plus DNA ex- 
tracted from 11 archival pathology sections (paraffin-embedded 
resections of primary esophageal cancers), which were demon- 
strated to contain TP53 mutations by using classical mutation 
detection methods [temporal temperature gradient gel electro- 
phoresis (TTGE), followed by direct sequencing]. 

Two silent, six missense, one splice-site mutation, and an 
insertion were confirmed by both techniques (Table 1). One of 
the tumors showed a missense mutation at codon 290 by APEX, 
instead of a silent, point mutation as detected by TTGE plus 
dideoxy sequencing. In addition, one point mutation, which 
escaped detection by TTGE plus dideoxy sequencing, was iden- 
tified by APEX. On the basis of these results, we conclude that 
the APEX-based TP53 mutation assay provides an accurate and 
cost-efficient tool for DNA sequence analysis of this frequently 
mutated gene. Additional oligonucleotides or regions of the 
TP53 gene can be easily added to the assay. This prototypic assay 
represents a valuable platform for the development of diagnostic 
sequencing assays, for TP53 and other genes of interest. 

Methods 

Template Preparation. Exons 2-9 of the TP53 gene were amplified 
from genomic DNA in three amplicons: exons 2-4 (with 5'- 
TGGAAGTGTCTCATGCTGGA and 5'-ATACGGCCAG- 
GCATTGAAGT primers), exons 5-6 (with 5'-TCTGTCTCCT- 
TCCTCTTCCT and 5'-CACTGACAACCACCCTTAAC 
primers), and exons 7-9 (with 5'-CTCATCTTGGGCCTGT- 
GTTA and 5'-GCCCCAATTGCAGGTAAAAC primers). A 
20% fraction of the dTTP in the amplification mixture was
substituted by dUTP (5, 7). The amplification products were 
concentrated and purified by ethanol precipitation in the pres- 
ence of ammonium acetate. Fragmentation and functional in- 
activation of the unincorporated dNTPs was achieved in a 
one-step reaction by addition of shrimp alkaline phosphatase 
(Amersham Biosciences, Piscataway, NJ) and thermolabile uracil
N-glycosylase (Epicentre Technologies, Madison, WI) (5) and
heat treatment.

Abbreviations: APEX, arrayed primer extension; TTGE, temporal temperature gradient gel
electrophoresis.

§To whom reprint requests should be addressed. E-mail: andres@ebc.ee.

The publication costs of this article were defrayed in part by page charge payment. This
article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C.
§1734 solely to indicate this fact.

www.pnas.org/cgi/doi/10.1073/pnas.082100599

PNAS | April 16, 2002 | vol. 99 | no. 8 | 5503-5508

[Figure 1 image: panel B is labeled "Thr170 ACG→ACA" and marks the sense and antisense signal areas, the averaged signals of the sample, the wt reference, and the distance and reference information.]

Fig. 1. TP53 APEX-based sequencing assay. (A) Grayscale images for each fluorescent dideoxy nucleotide are used for the sequence analysis. (B) Silent mutation
in the third base of codon 170 of TP53, analyzed by the Genorama software. Signals from the analyzed base are averaged and the signal pattern obtained is
compared with the wild-type (wt) reference. Grayscale bitmaps corresponding to all four fluorescent dideoxy nucleotides at the base to be determined are shown,
enabling visual analysis. The A signal in the sense area and the T signal in the antisense area are indicative of a mutation in the current tumor sample. The distance and reference
information consist of: (i) the distance measure of the given signal pattern from the wt reference pattern; (ii) the wt base with relative signal intensities at four
(A, C, G, and T) fluorescence channels; (iii) percentage of the signal pattern at the wt reference cluster database for the given base; and (iv) index given by the
Genorama software.

Oligonucleotide Microchips. APEX primers were designed according
to the wild-type sequence of the human TP53 gene (accession
no. U94788) for both sense and antisense directions. The 25-mer
oligonucleotides with 12-carbon amino linkers at their 5' end
were obtained from Genset (Paris). The oligonucleotides were
spotted onto 24 × 60 mm aminosilane plus phenylene
diisothiocyanate-coated microarray slides (8) (Asper, Ltd.).



Fig. 2. Performance of the APEX-based sequencing in different regions in the TP53 gene. In some regions of the gene, sense and antisense strands have different
performance in APEX. For instance, in exon 7, which can be viewed as an extreme case, the antisense strand signals are very good. At the same time the signals
from the sense strand perform below the average level. (A) Performance of oligos corresponding to different regions of the p53 gene from sense (s) and antisense
(as) strands is shown with data from two independent series of experiments. The upper bar (1) represents APEX performance from 20 repeated APEX reactions
with the same wild-type reference DNA, whereas only the automatically clustered signal intensities are used. The lower bar (2) represents visually corrected data
from 100 healthy individuals sequenced by APEX. Both of the patterns are highly overlapping. (B) Color code and scale for the image. Oligos with signals matching
the wt sequence at least in 75% of experiments are shown black. Oligos with zero signals or signals different from the wt sequence are shown yellow, orange,
or red.






Table 1. Mutations detected by TTGE and APEX assays

Sample   Codon             Nucleotide    Amino acid   TTGE + sequencing   APEX
8        172               GTT→TTT       Val→Phe      +                   -
         315               TCT→TGT       Ser→Cys      -                   +
10       175               CGC→CAC       Arg→His      +                   +
13       213               CGA→CGG       Arg          +                   +
1[?]     170               ACG→ACA       Thr          +                   +
16       Intron 5 splice   TAG→TAA                    +                   +
20       179               CAT→CGT       His→Arg      +                   [?]
22       170               ACG→ACACG     ins 2 bp     +                   +*
25       273               CGT→TGT       Arg→Cys      +                   +
31       290               CGC→CGA       Arg          +                   -
         290               CGC→CCC       Arg→Pro      -                   +
48       164               AAG→ACG       Lys→Thr      +                   [?]
         286               GAA→AAA       Glu→Lys      +                   +
53       175               CGC→CAC       Arg→His      +                   +
         213               CGA→CAA       Arg→Gln      +                   +
11       Total                                        13                  12

Concordance with TTGE plus dideoxy sequencing as the reference was 10 of
13. A different mutation in the same codon was identified by APEX than by
dideoxy sequencing (sample 31). One mutation was identified by APEX only
(sample 8).

*Presence of mutated base (A) determined from the sense strand only.



Primers were diluted to 50 μM concentration in 100 mM
carbonate buffer, pH 9.0, and spotted onto the activated surface
with an Affymetrix 417 arrayer (Affymetrix, Santa Clara, CA). The
slides were blocked with 1% ammonia solution and stored at 4°C 
until needed. Washing steps with 95°C water and 100 mM NaOH 
were performed before APEX reactions to reduce the back- 
ground fluorescence and avoid rehybridization of unbound 
oligonucleotides to the APEX slide. 

Genomic DNA Samples from the Estonian Population. The genomic DNA
samples from healthy individuals were obtained from the Insti- 
tute of Molecular and Cell Biology, University of Tartu/Esto- 
nian Biocentre, and comprised a subset of samples collected 
within the framework of the project The Influence of Genetical 
and Environmental Factors on Health of Estonian Population of 
the Estonian Ministry of Social Affairs. The project had been 
approved by the ethics committee of University of Tartu. 
Informed consent was signed by all the participants of the study. 

APEX-Based Resequencing. One-third of a product from 50 μl of
PCR was used for each primer extension reaction. The APEX
mixture consisted of 10 μl of fragmented product, 4 units of
Thermo Sequenase DNA polymerase (Amersham Pharmacia), 2 μl
of Thermo Sequenase reaction buffer (260 mM Tris-HCl, pH
9.5/65 mM MgCl2) (Amersham Pharmacia), and 2 μM final
concentration of each fluorescently labeled ddNTP: Texas Red-ddATP,
Cy3-ddCTP, fluorescein-ddGTP, Cy5-ddUTP (Amersham
Pharmacia; NEN). The DNA in buffer was denatured at 95°C
for 5 min. The enzyme and dye terminators were immediately
added to other components, and the whole mixture was applied to
prewarmed slides at 58°C. The reactions were allowed to proceed
for 20 min under parafilm and stopped by washing at 95°C for 2 × 90 s
in MilliQ water. A droplet of SlowFade Light Antifade Reagent
(Molecular Probes) was applied to the microchips to limit bleaching
of the fluorescein. The slides were imaged with the Genorama
imaging system (Asper, Ltd.), at 20-μm resolution.
The TP53 gene sequence and mutations were identified by
GENORAMA 3.0 genotyping software by using clustered signal
patterns from a sequenced wild-type DNA as the statistical
reference. The distances of signals from the clusters were used
as measures of match with the wild-type gene sequence. The distance
(d) of the sample signal pattern from the signal
patterns in the wild-type reference cluster database was
calculated as follows:

\[ d = \sqrt{\sum (N_c - N_s)^2} \]

where N_c is the signal intensity of the given nucleotide (A, C, G,
and T) in the cluster database, and N_s is the signal intensity of
the given nucleotide of the DNA sample.
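The distance measure can be illustrated with a minimal Python sketch. It assumes the formula is a Euclidean distance summed over the four fluorescence channels; the function name `signal_distance` and the intensity values are hypothetical and are not taken from the GENORAMA software.

```python
import math

CHANNELS = ("A", "C", "G", "T")


def signal_distance(reference, sample):
    """Euclidean distance between two four-channel signal patterns.

    reference -- dict of channel -> intensity from the wild-type cluster (N_c)
    sample    -- dict of channel -> intensity from the tested DNA (N_s)
    """
    return math.sqrt(sum((reference[ch] - sample[ch]) ** 2 for ch in CHANNELS))


# Hypothetical intensities: a wild-type G position and a sample in which
# part of the signal has shifted to the A channel.
wt_reference = {"A": 0.0, "C": 0.0, "G": 100.0, "T": 0.0}
sample = {"A": 30.0, "C": 0.0, "G": 60.0, "T": 0.0}

print(signal_distance(wt_reference, wt_reference))  # 0.0, perfect match
print(signal_distance(wt_reference, sample))        # 50.0
```

A sample identical to the reference gives a distance of zero, and the distance grows as signal moves from the wild-type channel into another channel.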

Results 

Oligonucleotide Design. Each base in TP53 is identified by two
unique 25-mer oligonucleotides, one for sense and one for
antisense strand (total of 2,436 oligonucleotides for the analyzed
sequence). The oligonucleotides are based on TP53 wild-type
sequence (accession no. U94788), with their 3' ends one base
upstream of the base to be identified. The vast majority of these
oligonucleotides perform well in APEX. A fraction of the
oligonucleotides formed secondary structures, either enabling 
signals from self-priming or interfering with annealing to test 
DNA, and therefore needed redesigning. Although the 3' end 
and its proximity of the primers cannot be modified, the internal 
part of the primer may be changed by incorporating a mismatch 
without seriously affecting the target-specific priming ability. 
Oligonucleotides for 5.9% of the sequenced bases from either 
strand were redesigned by introducing a mismatch to reduce the 
stability of the predicted dimers and avoid self-priming. After 
modification, 62% of these oligonucleotides generated signals 
only in the presence of target DNA and not from oligonucleotide 
dimers; 21% of the modified oligonucleotides did not give any 
signal either from the target DNA or self-priming because of 
their reduced hybridization ability; 17% of the modified se- 
quences produced weak or undetectable signals in half of the 
experiments. None of the modified oligonucleotides generated 
false-positive signals in the absence of the target DNA. 
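The basic primer layout described at the start of this section (two 25-mers per base, each ending one base upstream of the interrogated position) can be sketched as follows. This is an illustration only: the function names and the toy sequence are hypothetical, and the mismatch-based redesign of self-priming oligonucleotides is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PRIMER_LENGTH = 25  # 25-mers, as stated in the text


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(sense_seq, position, length=PRIMER_LENGTH):
    """Return (sense_primer, antisense_primer) for a 0-based position.

    Each primer is written 5'->3' and ends one base upstream of the base
    to be identified. A primer is None when the sequence is too short on
    that side of the position.
    """
    sense_primer = None
    antisense_primer = None
    if position >= length:
        sense_primer = sense_seq[position - length:position]
    if position + length < len(sense_seq):
        downstream = sense_seq[position + 1:position + 1 + length]
        antisense_primer = reverse_complement(downstream)
    return sense_primer, antisense_primer


# Toy example with 4-mers so the result is easy to check by hand.
toy = "ACGTTGCAAG"
print(design_primers(toy, 5, length=4))  # ('CGTT', 'CTTG')

# Two primers per base over 1,218 bases gives the 2,436 oligonucleotides.
print(2 * 1218)  # 2436
```

In the toy example the sense primer would be extended with G (the base at position 5) and the antisense primer with its complement, C.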

Some areas of the gene are difficult to sequence from both 
strands (Fig. 2A) for multiple reasons, including sequence repeats, 
regions with very high GC content, sequences corresponding to 
oligonucleotides with AT-rich 3' ends, etc. However, only a very
limited number of bases (2.5% on average) were not detected from 
either strand at the present state of assay development. 

Sequence Analysis Algorithm. As a general strategy in APEX, the 
sequence can be identified either from a single experiment or 
interpreted on the basis of a statistical analysis. Statistical 
analysis facilitates identification of deviations from the wild-type 
reference signal pattern indicative of mutations (Fig. 1B). The
level of possible secondary signals in the wild-type reference is 
useful for determining a threshold for acceptance or rejection of 
signals interpreted as mutations. A sequenced wild-type genomic 
DNA from a healthy individual was used to create a reference 
database of signal patterns. The signals from all of the oligonu- 
cleotides were analyzed by a clustering algorithm, grouping the 
signal patterns from four fluorescence channels. Each base in the 
sample was compared with the wild-type reference, and the value 
of the distance (see Methods) between the signal pattern and the
corresponding wild-type base was used as a measure for calling 
the given base. Zero distance indicates a perfect match between 
the given base and the wild-type reference base. The analysis was 
performed in an automated manner, and only a subset of signals 
needed visual examination. 
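The automated comparison against the wild-type reference can be sketched as a simple prescan that flags positions for visual examination. This is a simplified illustration, not the GENORAMA algorithm: the clustering step is omitted, and the cutoff value and signal patterns are hypothetical.

```python
import math


def distance(ref, obs):
    """Euclidean distance between two (A, C, G, T) intensity tuples."""
    return math.sqrt(sum((r - o) ** 2 for r, o in zip(ref, obs)))


def prescan(reference_patterns, sample_patterns, cutoff):
    """Return (position, distance) for every base whose signal pattern
    lies farther than `cutoff` from the wild-type reference pattern."""
    flagged = []
    for pos, (ref, obs) in enumerate(zip(reference_patterns, sample_patterns)):
        d = distance(ref, obs)
        if d > cutoff:
            flagged.append((pos, d))
    return flagged


# Hypothetical (A, C, G, T) intensities for three consecutive bases.
reference = [(0, 0, 100, 0), (100, 0, 0, 0), (0, 100, 0, 0)]
sample = [(0, 0, 100, 0), (60, 0, 0, 30), (0, 97, 0, 4)]

# Only position 1 exceeds the assumed cutoff of 10.
print(prescan(reference, sample, cutoff=10))  # [(1, 50.0)]
```

Positions returned by such a prescan correspond to the subset of signals that would need visual examination; all other positions are called as wild type.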






[Figure 3 plot: series Arg273His and Arg248Trp; x axis, % of Arg273His (0 to 100).]

Fig. 3. APEX sensitivity for the fraction of mutated DNA. Relationship
between the fraction of mutated DNA and the value of distance measure from
the wild-type reference signal pattern. PCR products from two cDNA clones
with known missense mutations Arg273His (CGT→CAT) and Arg248Trp
(CGG→TGG) were used for the titration. Both mutations are analyzed at the
DNA strand with G to A change. Five percent content of mutated DNA is
detectable in both cases (indicated by circles). Error bars represent the
standard deviations.



APEX Performance in TP53 Sequencing Tests with Numerous Samples. 

To evaluate performance of the TP53 APEX assay in large-scale 
studies, 100 normal DNA samples from the Estonian population 
were tested for common, single-nucleotide polymorphisms and 
for possible point mutations. A common single-nucleotide
polymorphism in exon 4 (Arg-72 → Pro; Arg72Pro) was found with a
minor allele frequency of 0.26. The genotype distribution of this
polymorphism is consistent with Hardy-Weinberg equilibrium
(P > 0.05). We also detected two silent point
mutations in codons 36 (CCG to CCA) and 139 (AAG to AAA)
(6) in two analyzed samples. The first one may correspond to a 
rare polymorphism, which has been identified in up to 4% of the 
general population (9). At present, no evidence has been pub- 
lished regarding the status of silent mutation at codon 139, but 
it cannot be ruled out that it might also correspond to a 
previously unrecognized, rare polymorphism. 
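A Hardy-Weinberg check of the kind reported for the Arg72Pro polymorphism can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The genotype counts below are hypothetical: they were chosen only so that 100 individuals give a minor allele frequency of 0.26, because the observed genotype counts are not listed in the text.

```python
import math


def hardy_weinberg_test(n_major_hom, n_het, n_minor_hom):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Returns (minor_allele_frequency, chi_square, p_value).
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_major_hom + n_het + n_minor_hom
    q = (n_het + 2 * n_minor_hom) / (2 * n)  # minor allele frequency
    p = 1.0 - q
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


# Hypothetical counts: 55 Arg/Arg, 38 Arg/Pro, 7 Pro/Pro.
maf, chi2, p_value = hardy_weinberg_test(55, 38, 7)
print(round(maf, 2))   # 0.26
print(p_value > 0.05)  # True: no significant deviation from equilibrium
```

A P value above 0.05 means the observed genotype counts do not deviate significantly from the proportions expected under Hardy-Weinberg equilibrium.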

On average, 97.5% of the arrayed TP53 sequence was identified 
in our current version of the TP53 assay from either sense or 
antisense strand, and 81% from both strands. In the best cases, 
respectively, up to 99.8% and 96% of the sequence were analyzed. 
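The two coverage figures (either strand versus both strands) can be computed from per-base call flags as in the following sketch. The function name and the ten-base example are hypothetical; the real assay covers 1,218 bases.

```python
def strand_coverage(sense_called, antisense_called):
    """Return (% of bases called from either strand, % called from both).

    Both arguments are equal-length lists of booleans, one entry per
    arrayed base, True when the base was successfully identified.
    """
    n = len(sense_called)
    pairs = list(zip(sense_called, antisense_called))
    either = sum(1 for s, a in pairs if s or a)
    both = sum(1 for s, a in pairs if s and a)
    return 100.0 * either / n, 100.0 * both / n


# Hypothetical call flags for a ten-base stretch.
sense = [True, True, False, True, False, True, True, True, False, True]
antisense = [True, False, True, True, False, True, True, True, True, True]

print(strand_coverage(sense, antisense))  # (90.0, 60.0)
```

The either-strand value is always at least as large as the both-strand value, which matches the relationship between the reported 97.5% and 81%.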

Sensitivity for Mutated DNA. DNA extracted from tumor samples 
always contains a background of normal DNA. APEX sensitivity 
for the minimal identified percentage of mutated DNA was 
titrated by mixing PCR products obtained from the mutant 
(Arg248Trp and Arg273His) TP53 cDNA clones at different 
ratios (Fig. 3). The mutations are located in different exons, and 
the clones were therefore used as a competitor fraction of 
normal DNA for each other. The signal patterns were different 
from the wild type, and both mutations were detected even if the 
sample contained as little as 5% of mutant DNA. The samples 
with zero percent of mutated DNA were matching the wild-type 
reference DNA (Arg273His with zero distance and Arg248Trp 
with a distance value of 3). The mixed samples with 5% of 
mutated DNA, on the contrary, did not match the signal pattern 
of the reference wild-type sample (average distances, 19 for
Arg273His and 58 for Arg248Trp). In fact, a 5% fraction of
mutated DNA could be identified by eye in the analysis software
window (Fig. 1B). APEX sensitivity to detect deletions was
titrated with a del 13-19 TP53 cDNA clone. The first base after
deletion is detected instead of the first deleted base (10, 11). The 
deletion was detected with sensitivity equal to a point mutation 



[Figure 4 image: signal intensity patterns for samples in which the fraction of DNA with the 21-bp deletion is 0%, 15%, 85%, and 100%; the deletion footprint is marked.]

Fig. 4. APEX sensitivity for detecting deletions. Detailed patterns of signal
intensities in the deletion area. The actual footprint (with weaker or missing
signals) exceeds the deleted sequence by 13 to 15 bases in the 3' direction from
either strand because of partial annealing with the target sequence. The first
base after deletion is detected instead of the first deleted base. Because of
cDNA used as the template, intron 3 footprint (IFP) is detectable in the
antisense strand.



by analyzing the first base after the deletion. The complementing
algorithm, based on detection of decreased signal intensities and
a deletion-specific footprint, was less sensitive and required at least
15% mutant sequence for detection (Fig. 4). The actual footprint
(with missing or weaker signals) exceeds the deletion by 13-15
bases in the 3' direction from either strand because of the partial
annealing with the target sequence (Fig. 4).

Blind Test with Tumor Samples. The tumors tested were from a 
series of squamous cell carcinomas of the esophagus collected in 
Iran between 1992 and 1998. These cancers often contain TP53 
mutations and are very good examples of a type of cancer in 
which TP53 mutation analysis may have a strong impact in 
clinical and epidemiological applications. Eleven samples, with 
a total of 12 point mutations and a 2-bp insertion in TP53, 
previously identified by TTGE plus manual or automated 
dideoxy sequencing of the extracted heteroduplex band, were 
used in a blind test for sensitivity and accuracy of APEX. 
Sequencing of the heteroduplex band has superior sensitivity to 
direct sequencing but requires gel purification of the PCR 
product. The total number of mutations determined was similar 
in both techniques. Two silent, six missense, one splice site 
mutation, and an insertion were concordantly identified (Table 
1). A missense mutation at the codon 290 was found by APEX 
instead of a silent point mutation as identified by TTGE plus 






dideoxy sequencing. One missense mutation not previously 
identified by TTGE plus sequencing was de novo identified by 
APEX. Only wild-type APEX signals were present in two 
samples, where missense mutations were previously determined. 

Discussion 

A practical approach to TP53 mutation screening has to combine 
affordable cost, high throughput, high specificity, and high
sensitivity. So far, the most advanced current alternative to dideoxy
sequencing is the GeneChip p53 assay (Affymetrix,
www.affymetrix.com), which has been recently evaluated (12-14). The
Affymetrix chip has good overall performance but a limited ability 
to detect deletions and insertions. Promising efforts have been 
made to couple the oligonucleotide array technology to single-base 
extension reaction by the DNA polymerase (10). Another recent 
approach, pyrosequencing, has shown accurate results for detection 
of mutations in a few exons of TP53 (15). 

The APEX-based sequencing approach described here, which
compares a sample with the wild-type reference by a distance
measure, is comparable with the GeneChip p53 assay, in which a score
from a mixture of variables between the wild-type reference and a
given sample is calculated. The higher the score for a probe set
contributing to a given base, the higher the likelihood for the base
being mutated (12, 14). In the GeneChip p53 assay, the single cutoff
level for calling mutations has been reported to be unsatisfactory
(14). The same situation could apply to the TP53 APEX-based
sequencing assay, but further studies are needed to evaluate the
possible benefit of approaching each base as a separate entity. The
applicable cutoff value for base calling also depends on whether the
sample is analyzed for germ-line or somatic mutations. In the
current work, a prescan of the sequence was made with a general
cutoff distance. The positions exceeding the threshold distance
from the wild-type signal pattern were visually verified. The use of
just one APEX oligonucleotide per sequenced base and the generally
low noise make fast visual inspection possible at positions
where the software gives ambiguous results.

The results from the 100 healthy individuals analyzed are
encouraging for applying APEX in large-scale TP53 studies, in which
single-nucleotide polymorphism data can have an impact on the
analysis of individual risks or of cancer outcome. The identified
Arg72Pro polymorphism has recently been proposed to play a 
role in tumorigenesis. Controversial evidence exists that the 
Arg-72 allele might be more sensitive to degradation induced by the 
oncoproteins of human papilloma viruses, suggesting that this 
polymorphism may predispose to cervical cancer (16). On the other 
hand, recent studies have shown that the cellular interactions of 
mutant p53 protein may be different depending on the allelotype of
codon 72 (17). The fact that our assay can simultaneously perform 
mutation detection and correct identification of codon 72 status 
adds further weight to its usefulness as a one-step assay in clinical 
or epidemiological studies. 

The detection limit for known TP53 alleles was as low as 5%.
The actual limit could sometimes be even lower,
but in real life the possible alleles are mostly unknown, and reliable
control and comparison with results obtained with standard methods
can be technically difficult because of their own error rates.
Dideoxy chain termination sequencing (ref. 18; Fig. 5) and
pyrosequencing (15) operate at a 30% detection limit for
mutation-specific signals. Heteroduplex analysis techniques like
TTGE have 10^-[?] sensitivity under optimized conditions (19), but
the most commonly used screening method, single-strand conformational
polymorphism, has also been shown to produce 5%
false-positive results (20). A potential explanation is
misincorporation of bases in PCR. Therefore, the fraction of mutated DNA
was not further diluted, and the TP53 APEX-based sequencing was
evaluated with tumor samples in a blind test.

The total number of mutations determined in 11 esophageal 
cancer samples was similar by both APEX and TTGE plus 



[Figure 5 image: Sample 25, Arg273Cys, CGT→TGT.]

Fig. 5. Missense mutation Arg273Cys CGT→TGT, difficult to identify by
automated dideoxy sequencing. (A) First base of TP53 codon 273, analyzed by
APEX-based sequencing. The signals corresponding to T in the sense strand
and A in the antisense strand are indicative of a mutation. (B) Automated
dideoxy sequencing images corresponding to the Arg273Cys mutation from
both DNA strands. The indicated mutation-specific peaks are in the range of
the background noise and can be easily missed by visual analysis.

sequencing. Two silent, six missense, one splice-site mutation, 
and an insertion were concordantly identified (Table 1). A 
missense mutation at the codon 290 was found by APEX instead 
of a silent point mutation as identified by TTGE plus dideoxy
sequencing. One missense mutation not previously identified by 
TTGE plus sequencing was de novo identified by APEX. Two 
samples with missense mutations escaped identification by 
APEX. However, in these specimens, identification of the 
mutation was possible only by dideoxy sequencing of a PCR
product generated from excised TTGE bands with abnormal 
migration patterns, indicating that mutant DNA was present 
only in a tiny fraction of the tumor. The latter results suggest that 
performance and sensitivity of the APEX-based sequencing 
could be enhanced and all of the mutations possibly identified by 
use of enrichment techniques such as microdissection of tumor 
cells from the sample. 

In conclusion, we have developed and evaluated an APEX- 
based sequencing test at the scale of the almost complete 
TP53-coding sequence, providing an accurate and cost-efficient 
tool for DNA sequence analysis of this frequently mutated gene. 
Novel analysis algorithms were developed enabling automatic 
sequencing. The evaluation test with tumor samples showed 
performance comparable with one of the most sensitive and also 
laborious technologies available, dideoxy sequencing of hetero- 
duplex band obtained by TTGE. However, due to the reduced 
number of steps in template preparation and the possibility of 
performing automated analysis, APEX is much more suitable for
developing high-throughput tests for clinical diagnostics and
large-scale epidemiological studies.

Note Added in Proof. When this manuscript was in process, a paper 
describing the resequencing of exon 7 in the TP53 gene was published (21). 

We thank Dr. A. Kristjuhan and V. Jaks, who kindly provided the 
mutant TP53 cDNA clones; E. Haamer, L. Land, I. Valvas, and V. Soo 
for technical assistance; Dr. K. Kask for critical reading of the 
manuscript; and Mrs. Krista Liiv for superior organizing work 
throughout the whole project. This work was supported in part by 
research grants from the European Community (IC15-CT98-0309), 
the Estonian Science Foundation (4479), and the Estonian Ministry of 
Education (Core Grant 0181518s98). A. Metspalu received the
Visiting Scientist Award of the International Agency for Research on
Cancer, Lyon, France.






1. Thorlacius, S., Borresen, A. L. & Eyfjord, J. E. (1993) Cancer Res. 53,
1637-1641.

2. Wen, W. H., Reles, A., Runnebaum, I. B., Sullivan-Halley, J., Bernstein, L.,
Jones, L. A., Felix, J. C., Kreienberg, R., el-Naggar, A. & Press, M. F. (1999)
Int. J. Gynecol. Pathol. 18, 29-41.

3. Aas, T., Borresen, A. L., Geisler, S., Smith-Sorensen, B., Johnsen, H., Varhaug,
J. E., Akslen, L. A. & Lonning, P. E. (1996) Nat. Med. 2, 811-814.

4. Cabelguenne, A., Blons, H., de Waziers, I., Carnot, F., Houllier, A. M., Soussi,
T., Brasnu, D., Beaune, P., Laccourreye, O. & Laurent-Puig, P. (2000) J. Clin.
Oncol. 18, 1465-1473.

5. Kurg, A., Tonisson, N., Georgiou, I., Shumaker, J., Tollett, J. & Metspalu, A.
(2000) Genet. Test. 4, 1-7.

6. Hernandez-Boussard, T., Rodriguez-Tome, P., Montesano, R. & Hainaut, P.
(1999) Hum. Mutat. 14, 1-8.

7. Cronin, M. T., Fucini, R. V., Kim, S. M., Masino, R. S., Wespi, R. M. & Miyada,
C. G. (1996) Hum. Mutat. 7, 244-255.

8. Guo, Z., Guilfoyle, R. A., Thiel, A. J., Wang, R. & Smith, L. M. (1994) Nucleic
Acids Res. 22, 5456-5465.

9. Felix, C. A., Brown, D. L., Mitsudomi, T., Ikagaki, N., Wong, A., Wasserman,
R., Womer, R. B. & Biegel, J. A. (1994) Oncogene 9, 327-328.

10. Head, S. R., Rogers, Y. H., Parikh, K., Lan, G., Anderson, S., Goelet, P. &
Boyce-Jacino, M. T. (1997) Nucleic Acids Res. 25, 5065-5071.

11. Tonisson, N., Kurg, A., Lohmussaar, E. & Metspalu, A. (2000) in Microarray
Biochip Technology, ed. Schena, M. (Eaton Publishing, Natick, MA), pp.
247-263.

12. Ahrendt, S. A., Halachmi, S., Chow, J. T., Wu, L., Halachmi, N., Yang, S. C.,
Wehage, S., Jen, J. & Sidransky, D. (1999) Proc. Natl. Acad. Sci. USA 96,
7382-7387.

13. Wen, W. H., Bernstein, L., Lescallett, J., Beazer-Barclay, Y., Sullivan-Halley,
J., White, M. & Press, M. F. (2000) Cancer Res. 60, 2716-2722.

14. Wikman, F. P., Lu, M. L., Thykjaer, T., Olesen, S. H., Andersen, L. D.,
Cordon-Cardo, C. & Orntoft, T. F. (2000) Clin. Chem. 46, 1555-1561.

15. Garcia, C. A., Ahmadian, A., Gharizadeh, B., Lundeberg, J., Ronaghi, M. &
Nyren, P. (2000) Gene 253, 249-257.

16. Storey, A., Thomas, M., Kalita, A., Harwood, C., Gardiol, D., Mantovani, F.,
Breuer, J., Leigh, I. M., Matlashewski, G. & Banks, L. (1998) Nature (London)
393, 229-234.

17. Tada, M., Furuuchi, K., Kaneda, M., Matsumoto, J., Takahashi, M., Hirai,
A., Mitsumoto, Y., Iggo, R. D. & Moriuchi, T. (2001) Carcinogenesis 22,
515-517.

18. Rosenblum, B. B., Lee, L. G., Spurgeon, S. L., Khan, S. H., Menchen,
S. M., Heiner, C. R. & Chen, S. M. (1997) Nucleic Acids Res. 25,
4500-4504.

19. Bjorheim, J., Lystad, S., Lindblom, A., Kressner, U., Westring, S., Wahlberg,
S., Lindmark, G., Gaudernack, G., Ekstrom, P., Roe, J., et al. (1998) Mutat. Res.
403, 103-112.

20. Yuan, B., Thomas, J. P., von Kodolitsch, Y. & Pyeritz, R. E. (1999) Hum. Mutat.
14, 440-446.

21. Shumaker, J. M., Tollet, J. J., Filbin, K. J., Montague-Smith, M. P. & Pirrung,
M. C. (2001) Bioorg. Med. Chem. 9, 2269-2278.



5508 | www.pnas.org/cgi/doi/10.1073/pnas.082100599






GENETIC TESTING
Volume 4, Number 1, 2000
Mary Ann Liebert, Inc.



Arrayed Primer Extension: Solid-Phase Four-Color DNA 
Resequencing and Mutation Detection Technology 

ANTS KURG, NEEME TÕNISSON, IOANNIS GEORGIOU, JOHN SHUMAKER, JEFF TOLLETT,
and ANDRES METSPALU



ABSTRACT 

The technology and application of arrayed primer extension (APEX) are presented. We describe an integrated
system with DNA chip and template preparation, multiplex primer extension on the array, fluorescence imaging,
and data analysis. The method is based upon an array of oligonucleotides, immobilized via the 5' end on
a glass surface. A patient DNA is amplified by PCR, digested enzymatically, and annealed to the immobilized
primers, which promote sites for template-dependent DNA polymerase extension reactions using four unique
fluorescently labeled dideoxy nucleotides. A mutation is detected by a change in the color code of the primer
sites. The technology was applied to the analysis of 10 common β-thalassemia mutations. Nine patient DNA
samples, each of which carries a different mutation, and four wild-type DNA samples were correctly identified.
The signal-to-noise ratio of this technology is, on the average, 40:1, which enables the identification of
heterozygous mutations with a high confidence level. The APEX method can be applied to any DNA target
for efficient analysis of mutations and polymorphisms.



INTRODUCTION 

GENETICS AND MOLECULAR MEDICINE have an expanding need
for rapid genotyping, mutation analysis, and DNA resequencing
technologies that have a clear potential for miniaturization,
parallelization, and automation and enable high
throughput and ability to identify changes precisely in the patient
DNA. Conventional methods for mutation detection, such
as single strand conformation polymorphism (SSCP), denaturing
gradient gel electrophoresis (DGGE), chemical cleavage, or
direct sequencing are time and labor intensive (Cotton et al.,
1998) with little parallelization. A promising solution for this
technological need is the use of oligonucleotide arrays using
nucleic acid hybridization (Chee et al., 1996; Cronin et al.,
1996; Hacia et al., 1996) or hybridization coupled with an
enzyme-mediated reaction, either by primer extension (Shumaker
et al., 1996; Head et al., 1997; Pastinen et al., 1997) or ligation
(Landegren et al., 1998). The most developed approach today,
hybridization of labeled target to high-density oligonucleotide
microarrays (e.g., Affymetrix GeneChip™ arrays), is
a revolutionary method for DNA sequence analysis. Feasibility
studies of this approach are promising and the first results
are impressive. However, the high complexity of the assays due
to the large number of oligonucleotide probes per target sequence
base, sensitivity to hybridization conditions, complicated
data analysis, and high cost are driving several research
groups to look for alternative technologies.

Here we present Arrayed Primer Extension (APEX) technology
as an alternative to array-based DNA sequence analysis
by hybridization. We describe an integrated system with chip
and template preparation, multiplex primer extension on the array,
fluorescence imaging, and data analysis. The method is
based upon a two-dimensional (2D) array of oligonucleotides,
immobilized via the 5' terminal amino group onto an
epoxy-silanized glass support. This method can be viewed as
dye-terminator sequencing of DNA, but instead of using one primer
and analyzing hundreds of extension products in polyacrylamide
gel electrophoresis (PAGE), we can use hundreds to
thousands of primers that are spatially separated and extend
each by only one dye-labeled nucleotide. The single-tube sample
preparation protocol consists of PCR amplification followed
by a DNA fragmentation reaction. After hybridization of the
target DNA to the array, target-dependent oligonucleotide
extension by a DNA polymerase is used to incorporate
fluorescently labeled dideoxy terminators onto the primers (Head et
al., 1997).

Institute of Molecular and Cell Biology, Tartu Children's Hospital, University of Tartu, Estonian Biocentre, Tartu 51010, Estonia.
Medical School, University of Ioannina, Ioannina 45500, Greece.
Baylor College of Medicine, Houston, TX 77030.
Asper Ltd., Tartu 51014, Estonia.

We have developed a total internal reflection fluorescence
(TIRF) excitation mechanism combined with a charge coupled
device (CCD) detector for high-throughput image acquisition.
The signals from the spectrally separated dyes are resolved to
better than 2% cross-talk with dual selection by laser excitation
and bandpass filtering of the emitted fluorescence. Imaging is
followed by a software analysis to convert the fluorescence
information into sequence data. The APEX method combines the
advantage of both high information content of the oligonucleotide
array and fidelity of the enzymatic primer extension
reaction. The enzyme acts as a biological proofreading mechanism,
discriminating against 3' end mismatches. Moreover, by
using primer extension, each position on the array identifies a
unique base of the sequence as the result of the direct competition
of four dye-terminators for the same spot, thus reducing
the array complexity by at least a factor of four. Signal-to-noise
ratio is improved by the added fidelity of the polymerase. The
elimination of intensity comparisons across multiple spots, as
is the case for hybridization assays, makes the analysis more
robust. Two unique oligonucleotide primers probe the sense and
antisense target strands at the same base location. APEX can
be used to analyze known point mutations, deletions, and
insertions and can identify the presence of unknown polymorphisms.

We applied APEX for the identification of 10 common point
mutations in the human β-globin gene (Cao et al., 1997; Huisman
et al., 1998), causing β-thalassemia in their homozygous
state. β-Thalassemia is a very common autosomal recessive disorder
in populations of Mediterranean, Middle Eastern, and Far
Eastern descent. It has been estimated that approximately 240
million people worldwide are heterozygotes for β-thalassemia
and at least 200,000 affected individuals are born annually (Cao
et al., 1997). The β-globin gene is rather short for a human
gene (~1.5 kb), but harbors more than 150 mutations worldwide.
Despite this heterogeneity, each at-risk population has its
own spectrum of 5-10 common mutations (Cao et al., 1997).
Due to this phenomenon, an array for identification of 10 common
mutations would offer a platform for high-throughput genetic
testing of β-thalassemia in a given population. The mutations
studied in this work are typical for the Mediterranean
region (see Table 1). Extra primers are needed to expand the
10-mutation-specific chip presented here to a human β-globin
gene resequencing chip, allowing the analysis of nucleotide
changes regardless of patient origin or mutation location.

In this report, we describe the APEX technology and present
the results of a β-thalassemia mutation study. Nine DNA samples
from patients and carriers, each of which carries a different
mutation, and four wild-type DNA samples, were correctly
identified.

MATERIALS AND METHODS 

DNA sequencing

DNA samples were previously sequenced by the Sanger
dideoxy chain-termination method.

Template DNA preparation

A 1,420-bp fragment of the human β-globin gene (NID
g183829; accession no. M36640) was amplified from human
genomic DNA using the following primer sequences:
Forward primer 5'-ACAGGTACGGCTGTCATCAC-3';
Reverse primer 5'-AGAATAATCCAGCCTTATCCC-3'.



Table 1. Ten Common Point Mutations from the β-Globin Gene for Identification with APEX

Mutation     Sequence   Class                    Origin          Primers
             change                                              (sense/antisense)

-87          C → G      Transcriptional mutant   Mediterranean   5'-TAGACCTCACCCTGTGCAGCCACAC
                                                                 5'-CTGGGAGTAGATTGGCCAACCCTAG
Codon 5      ΔCT        Frameshift               Mediterranean   5'-ACAGACACCATGGTGCACCTGACTC
                                                                 5'-CGGCAGTAACGGCAGACTTCTCCTC
Codon 6      ΔA         Frameshift               Mediterranean   5'-GACACCATGGTGCACCTGACTCCTG
                                                                 5'-CAGGGCAGTAACGGCAGACTTCTCC
IVS-I-1      G → A      Splice junction change   Mediterranean   5'-AAGTTGGTGGTGAGGCCCTGGGCAG
                                                                 5'-ACCTGTCTTGTAACCTTGATACCAA
IVS-I-5      G → A      Consensus change         Mediterranean   5'-TGGTGGTGAGGCCCTGGGCAGGTTG
                                                                 5'-TTAAACCTGTCTTGTAACCTTGATA
IVS-I-6      T → C      Consensus change         Mediterranean   5'-GGTGGTGAGGCCCTGGGCAGGTTGG
                                                                 5'-CTTAAACCTGTCTTGTAACCTTGAT
IVS-I-110    G → A      Internal IVS change      Mediterranean   5'-TAGGCACTGACTCTCTCTGCCTATT
                                                                 5'-GCAGCCTAAGGGTGGGAAAATAGAC
Codon 39     C → T      Nonsense                 Mediterranean   5'-GCTGCTGGTGGTCTACCCTTGGACC
                                                                 5'-TCCCCAAAGGACTCAAACAACCTCT
IVS-II-1     G → A      Splice junction change   Mediterranean   5'-GCACGTGGATCCTGAGAACTTCAGG
                                                                 5'-AAACATCAAGGGTCCCATAGACTCA
IVS-II-745   C → G      Internal IVS change      Mediterranean   5'-ATTGCTAATAGCAGCTACAATCCAG
                                                                 5'-ACCATAAAATAAAAGCAGAATGGTA
Markers                                                          5'-TTTAGCCTTAACGCCTXTGACGTCA
                                                                 X = A for self-extending T marker
                                                                 X = C for self-extending G marker, etc.

Additional origins listed in the Origin column, row assignment not recoverable: Balkans; European; American Black, Asian.






The PCR primers were obtained from Life Technologies, Inc.
(Gaithersburg, MD). The amplification mixture was prepared
and distributed into 50-µl aliquots. The mixture contained: 5 µl
of 10× PCR buffer (containing 200 mM Tris-HCl, pH 8.4, 500
mM KCl) (Life Technologies), 2.5 mM MgCl2, 0.25 mM of each
deoxynucleotide triphosphate (dATP, dCTP, dGTP), 0.2 mM
dTTP, 0.05 mM dUTP (Amersham Pharmacia Biotech, Inc.,
Milwaukee, WI), 40 pmol of each primer, and 1 unit of Platinum
Taq DNA Polymerase (Life Technologies). The amplification
reactions were performed in a PTC-200 instrument (MJ
Research, Inc., Watertown, MA). First, an initial incubation at
[illegible] for 5 min was performed, followed by 34 amplification
cycles consisting of denaturation at 94°C for 30 sec; primer
annealing at 61°C for 30 sec; and extension at 72°C for 1 min. The
final extension was at 72°C for 5 min.
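The dUTP substitution level implied by the mixture above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming only the nucleotide concentrations listed in the mixture; the function names are illustrative and not part of the protocol.

```python
# Nucleotide concentrations (mM) as listed in the amplification mixture.
dntp_mM = {"dATP": 0.25, "dCTP": 0.25, "dGTP": 0.25, "dTTP": 0.2, "dUTP": 0.05}


def t_pool_mM(mix):
    """Total thymidine-equivalent pool (dTTP + dUTP) in mM."""
    return mix["dTTP"] + mix["dUTP"]


def dutp_fraction(mix):
    """Fraction of the thymidine-equivalent pool supplied as dUTP."""
    return mix["dUTP"] / t_pool_mM(mix)


print(f"T pool: {t_pool_mM(dntp_mM):.2f} mM")
print(f"dUTP share of T pool: {dutp_fraction(dntp_mM):.0%}")
```

The T pool matches the 0.25 mM used for the other three nucleotides, and one fifth of it is dUTP.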

The amplification products were initially concentrated and
purified by ethanol precipitation in the presence of ammonium
acetate. Fragmentation and functional inactivation of unincorporated
dNTPs was achieved in a one-step reaction by the addition
of 1/5 U of shrimp alkaline phosphatase (Amersham
Pharmacia Biotech, Inc.) and W5 U of thermolabile uracil
N-glycosylase (Epicentre Technologies, Madison, WI) per one
amplification product. The reaction was incubated at 37°C for
1 hour and used directly in primer extension reactions.

Oligonucleotide microchips 

Oligonucleotide primers were designed, according to the
wild-type sequence of the human β-globin gene, for both
sense and antisense directions. 25-Mer oligonucleotides with
amino linkers at their 5' ends were obtained from Genset
(Paris, France). All but Codon 5 scanning oligonucleotides
were designed to scan 1 bp in the wild-type sequence. To
look for the Codon 5 ΔCT frameshift mutation, an antisense
primer with +1 nucleotide shift in the sequence was used.
Oligonucleotide primers were attached to an epoxy-activated
glass surface via an amino linker at their 5' end (Southern et
al., 1992; Lamture et al., 1994; Shumaker et al., 1996; Pastinen
et al., 1997). Glass slides (24 × 60 mm; Fisherfinest Premium
Cover Glasses, Fisher Scientific, Pittsburgh, PA) were
sonicated in acetone and 100 mM NaOH (5 min each), rinsed
in MilliQ water, and finally sonicated for 2 min with a solution
of 2% (3-glycidoxypropyl)trimethoxysilane (Gelest Inc.,
Tullytown, PA) in 95% ethanol solution. Unbound silane and
residual water were removed by brief rinsing in 100% ethanol.
Primers were diluted to 50 µM concentration in 100 mM NaOH
and spotted onto the activated surface with the TECAN RSP
5031 pipetting robot (TECAN AG, Hombrechtikon, Switzerland)
or a custom manufactured 25 Gauge, 96-tip capillary arrayer.
The slides were stored in a dust-free environment at 4°C
until needed and washed twice in 95°C MilliQ water prior to
APEX reactions. Slides prepared this way are extremely stable
and can be used even after 15 months of storage.
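The primer design rule described above (one sense and one antisense primer ending immediately next to the interrogated base) can be sketched as follows. This is an illustration only: the sequence is a short hypothetical string, the primer length is shortened from 25 to 5 for readability, and the function names are not from the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def apex_primer_pair(wild_type, site, length=25):
    """Return (sense, antisense) primers flanking the 0-based index `site`.

    The sense primer is the `length` bases 5' of the site; the antisense
    primer is the reverse complement of the `length` bases 3' of the site.
    Both therefore end (3') directly adjacent to the interrogated base.
    """
    if site - length < 0 or site + 1 + length > len(wild_type):
        raise ValueError("not enough flanking sequence for this primer length")
    sense = wild_type[site - length:site]
    antisense = reverse_complement(wild_type[site + 1:site + 1 + length])
    return sense, antisense


def expected_wild_type_calls(wild_type, site):
    """Terminator expected on the (sense, antisense) primer for wild type."""
    base = wild_type[site]
    return base, base.translate(COMPLEMENT)


# Hypothetical 11-nt sequence; index 5 ('C') is the interrogated base.
toy = "GATTACAGGCT"
print(apex_primer_pair(toy, 5, length=5))
print(expected_wild_type_calls(toy, 5))
```

A frameshift site such as Codon 5 would need the shifted antisense primer mentioned in the text, which this simple rule does not model.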

Arrayed primer extension reactions 

As estimated by comparison with a Gibco BRL mass ladder,
200-300 ng of the amplified product was used per one APEX
reaction. The 20-µl primer extension reactions consisted of 10
µl of fragmented product, 4 U of Thermo Sequenase DNA polymerase
(Amersham Pharmacia Biotech), 2 µl of Thermo Sequenase
reaction buffer (260 mM Tris-HCl, pH 9.5, 65 mM
MgCl2) (Amersham Pharmacia Biotech), and 1 µM final concentration
of each fluorescently labeled ddNTP (Amersham
Pharmacia Biotech; NEN, Boston, MA). The DNA in buffer
was denatured at 95°C for 5 min. The enzyme and dye were
immediately added to the other components and the whole mix
was applied to prewarmed slides at 48°C. The reactions were
allowed to proceed for 20 min under coverslips and stopped by
washing at 80°C for 2 × 90 sec in MilliQ water. A droplet of
SlowFade Light Antifade Reagent (Molecular Probes, Eugene,
OR) was applied to the chips to limit bleaching of the fluorescein.
The signals were acquired by a custom built TIRF-based
CCD detector.
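As a quick check of the reaction composition, the final buffer concentrations follow from a simple dilution of the stated stock. The sketch below assumes that the 2 µl of reaction buffer is the only source of Tris-HCl and MgCl2 in the 20-µl reaction; the helper name is illustrative.

```python
def final_concentration(stock_mM, added_ul, total_ul):
    """Concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock_mM * added_ul / total_ul


REACTION_UL = 20.0   # total primer extension volume
BUFFER_UL = 2.0      # volume of reaction buffer added
buffer_stock_mM = {"Tris-HCl": 260.0, "MgCl2": 65.0}

final_mM = {
    name: final_concentration(conc, BUFFER_UL, REACTION_UL)
    for name, conc in buffer_stock_mM.items()
}
for name, conc in final_mM.items():
    print(f"{name}: {conc:.1f} mM final")
```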

TIRF-based image detector 

The TIRF-based CCD detector consists of: (1) a set of lasers
used to excite the spectrally separable dye set, such as fluorescein,
Cy3, Texas Red, and Cy5; (2) a mechanism for shuttering,
expanding, and launching the lasers sequentially into the glass
slides used as microarray substrates; (3) a filter wheel used for
sequentially selecting the emission band-pass for each dye; and
(4) an imaging lens and CCD imager for recording the spatial
fluorescence intensities.

Light from the excitation lasers is directed along common
paths. The ribbon of light strikes a continuous linearly recirculating
mirror (dither mirror), which deflects the light upward
toward the vertex of a prism. The cylinder lens directly in front
of the prism has two functions: (1) focusing the ribbon in the
narrow dimension at the prism vertex, and (2) producing a continuously
changing launch angle together with the dither mirror.
The range of angles produced ensures that the excitation
is uniform over the slide. Fluorescence is collected by a 60-mm
f/2.8 MicroNikkor objective (Nikon, Japan) at a 3:1 imaging
ratio. The Quantix CCD camera (Photometrics, Tucson, AZ) is
cooled to -25°C and contains a Kodak KAF-1400 CCD chip
with 1,037 × 1,315 pixels that are 6.8 µm × 6.8 µm square,
giving a maximum resolution of 20 µm over the slide in the current
imaging ratio. For arrays with fewer, larger spots, the full
spatial resolution of the camera is not required and the CCD
pixels are "binned" in a 2 × 2 fashion, permitting 4× faster
imaging times. The camera, custom-machined shutters, and
FW1 filter wheel (Integrated Scientific Imaging Systems, Inc.,
Santa Barbara, CA) are controlled by a customized version of
Image Pro Plus™ software (Media Cybernetics, Inc., Silver
Spring, MD), which automates the acquisition of the sequence
of images required for each assay. More information about the
TIRF-based fluorescence detector can be found at
(http://www.asper.ee).
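The quoted resolution and the binning speed-up follow from the pixel size and imaging ratio. The sketch below is a rough check under two assumptions: that the 3:1 imaging ratio means a 3-fold demagnification from slide to CCD, and that imaging time scales with the number of binned pixels read out. The function names are illustrative.

```python
def slide_pixel_um(ccd_pixel_um, imaging_ratio, binning=1):
    """Size on the slide covered by one (optionally binned) CCD pixel."""
    return ccd_pixel_um * imaging_ratio * binning


def field_of_view_mm(n_pixels, ccd_pixel_um, imaging_ratio):
    """Extent on the slide covered by `n_pixels` along one CCD axis."""
    return n_pixels * ccd_pixel_um * imaging_ratio / 1000.0


def binning_speedup(binning):
    """Reduction in pixels read out for square `binning` x `binning` bins."""
    return binning * binning


PIXEL_UM = 6.8
RATIO = 3

print(f"unbinned: {slide_pixel_um(PIXEL_UM, RATIO):.1f} um per pixel")
print(f"2 x 2 binned: {slide_pixel_um(PIXEL_UM, RATIO, binning=2):.1f} um per pixel")
print(f"field of view: {field_of_view_mm(1037, PIXEL_UM, RATIO):.1f} mm x "
      f"{field_of_view_mm(1315, PIXEL_UM, RATIO):.1f} mm")
print(f"binning speed-up: {binning_speedup(2)}x")
```

Under these assumptions one frame covers roughly 21 mm × 27 mm, which spans most of the width of a 24 × 60 mm slide but not its full length.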

Analysis of data 

Images were processed with the Image Pro Plus software.
Patterns of all four incorporated nucleotides were recorded under
different color codes (A, yellow; T, cyan; C, red; and G,
green). Four-color images were generated using a macro.

RESULTS AND DISCUSSION 

Design of the assay 

An integral part of the assay is the DNA chip. We have used
standard microscope cover glasses (24 × 60 mm) activated by
epoxy-silanization (Southern et al., 1992; Lamture et al., 1994)
for oligonucleotide attachment. The primers were coupled to
the activated slides by their 5' amino linker under alkaline
conditions. The average coupling efficiency was 5%, as estimated
by immobilization of radioactively labeled oligonucleotide
(data not shown).

Table 2. β-Thalassemia APEX Slide Key

            M   -87     Codon 5  Codon 6  IVS-I-1  IVS-I-5  IVS-I-6  IVS-I-110  Codon 39  IVS-II-1  IVS-II-745  M

Sense       N   C to G  C to T   A to G   G to A   G to A   T to C   G to A     C to T    G to A    C to G      N
Antisense   A   G to C  A to G   T to C   C to T   C to T   A to G   C to T     G to A    C to T    G to C      C
Sense       G   C to G  C to T   A to G   G to A   G to A   T to C   G to A     C to T    G to A    C to G      T
Antisense   N   G to C  A to G   T to C   C to T   C to T   A to G   C to T     G to A    C to T    G to C      N

The left and right columns consist of self-extending marker primers (M) and the middle 10 columns are duplicates of the sense
adjacent to antisense primers for the mutation sites listed in Table 1. The letters show the analysis results expected from a
wild-type DNA and from mutations, respectively.
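Because each site is read from both strands, the antisense call in the Table 2 key should normally be the base complement of the sense call. The sketch below checks this for the wild-type/mutant pairs as transcribed from the Table 2 key; the transcription and the helper names are assumptions of this example, not part of the original analysis software.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# (wild-type call, mutant call) per site, transcribed from the Table 2 key.
SENSE = {
    "-87": ("C", "G"), "Codon 5": ("C", "T"), "Codon 6": ("A", "G"),
    "IVS-I-1": ("G", "A"), "IVS-I-5": ("G", "A"), "IVS-I-6": ("T", "C"),
    "IVS-I-110": ("G", "A"), "Codon 39": ("C", "T"), "IVS-II-1": ("G", "A"),
    "IVS-II-745": ("C", "G"),
}
ANTISENSE = {
    "-87": ("G", "C"), "Codon 5": ("A", "G"), "Codon 6": ("T", "C"),
    "IVS-I-1": ("C", "T"), "IVS-I-5": ("C", "T"), "IVS-I-6": ("A", "G"),
    "IVS-I-110": ("C", "T"), "Codon 39": ("G", "A"), "IVS-II-1": ("C", "T"),
    "IVS-II-745": ("G", "C"),
}


def complement_call(call):
    """Complement both the wild-type and the mutant base of a call pair."""
    wild_type, mutant = call
    return COMPLEMENT[wild_type], COMPLEMENT[mutant]


def non_complementary_sites(sense, antisense):
    """Sites whose antisense key is not the complement of the sense key."""
    return [
        site for site in sense
        if complement_call(sense[site]) != antisense[site]
    ]


print(non_complementary_sites(SENSE, ANTISENSE))
```

With this transcription only Codon 5 departs from simple complementarity, which is consistent with the shifted antisense primer used for that frameshift site.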



FIG. 1. Four-channel pseudocolor images of β-thalassemia APEX analysis. Table 2 indicates the positions of oligonucleotides
on the array. A. Wild-type and homozygous IVS-I-110 (G → A; Tables 1 and 2) mutation. Dashed boxes indicate the IVS-I-110
sense and antisense primer locations. B. DNA sample carrying homozygous IVS-I-110 (G → A; Tables 1 and 2) mutation.
Compound pseudocolor images representing signals from complementary nucleotides are shown for clarity. C. DNA sample carrying
heterozygous IVS-I-6 (T → C; Tables 1 and 2) mutation. Wild-type allele is apparent in left panel while mutant allele is visible
on the right panel. D. Color code of the images.

FIG. 2. Fluorescent intensities of three sequential mutation analysis sites on the β-thalassemia oligonucleotide array (Tables 1
and 2). A. IVS-I-5 site. Bars represent average fluorescent intensities of three independent experiments with a wild-type target
DNA. B. IVS-I-6 site. Average fluorescent intensities from two different experiments with a heterozygous target DNA. C. IVS-I-110
site showing average fluorescent intensities from the same three independent experiments as (A).

Template preparation

We have developed a single-tube template preparation protocol,
consisting of PCR amplification followed by a DNA fragmentation
reaction. A 1.4-kb fragment of the patient DNA was
first amplified by PCR. A fraction of the dTTPs was substituted
by dUTPs in the PCR mix, allowing for later fragmentation with
uracil N-glycosylase (UNG) and heat treatment. In vivo, UNG
acts as one of the most efficient DNA repair enzymes, hydrolyzing
specifically the N-glycosylic bond connecting uracil
to the deoxyribose sugar and generating abasic sites in DNA.
In vitro, this reaction can be used for asymmetric fragmentation
of the template DNA (Cronin et al., 1996). Replacement
of 20% of dTTPs was optimal for β-thalassemia APEX. However,
other concentrations of dUTP might be needed for templates
with different lengths and thymidine content. Fragmentation
offers several advantages for APEX by reducing the
effects of secondary structures, reducing the melting temperature
of target duplexes, and permitting the analysis of both
strands simultaneously. Fragmentation also promotes greater
mobility of the template and increases its effective concentration.
In addition to the UNG treatment, several other possibilities
exist for DNA template fragmentation, such as DNase I
treatment, restriction enzyme digestion, and mechanical shearing;
however, none of these offers the combination of reproducibility,
fragment size, staggered single-sided nicks, and assay
flexibility.
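To see why the optimal dUTP level depends on thymidine content, a simple random-incorporation model can be used. This is a rough sketch and not a description of the authors' optimization: it assumes uracil is incorporated at T positions in proportion to its share of the dTTP + dUTP pool, that positions are independent, and that a hypothetical fraction of uracil sites (`cleavage_efficiency`) is actually cut. The T fraction of 0.25 is a placeholder, not a measured value for the β-globin amplicon.

```python
def mean_cut_spacing(t_fraction, dutp_fraction, cleavage_efficiency=1.0):
    """Expected distance (nt) between strand breaks on one strand.

    t_fraction: fraction of bases in the strand that are T (assumed).
    dutp_fraction: share of the dTTP + dUTP pool supplied as dUTP.
    cleavage_efficiency: hypothetical fraction of uracil sites cleaved.
    """
    cut_probability = t_fraction * dutp_fraction * cleavage_efficiency
    if cut_probability <= 0:
        raise ValueError("cut probability must be positive")
    return 1.0 / cut_probability


# 20% dUTP substitution, placeholder T content of 25%.
print(mean_cut_spacing(0.25, 0.20))
# Same template if only half of the uracil sites were cleaved.
print(mean_cut_spacing(0.25, 0.20, cleavage_efficiency=0.5))
```

In this model a T-rich template is cut more densely at the same substitution level, which is one way to read the remark that other dUTP concentrations may be needed for other templates.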

After amplification, the PCR products were concentrated by
ethanol precipitation. dNTPs carried over from the PCR reaction
are a source of nonspecific extension noise in the APEX
reaction and must be removed. The remaining dNTPs were degraded
enzymatically using shrimp alkaline phosphatase (sAP).
Simultaneously, the amplicons were treated with UNG.

In the case of the β-globin gene as the model system, amplification
with a single pair of primers was sufficient to evaluate
the common mutations. However, to apply the APEX approach
to larger genes, amplification with multiple primers may be
necessary.

FIG. 3. Scheme of TIRF excitation/CCD imaging system.
The excitation beam is trapped in the oligonucleotide array slide
by total internal reflection at the slide/air interface. The launch
angle is nominally 20° below horizontal. As the excitation beam
travels down the oligonucleotide array slide, it expands to fill
the slide uniformly. The evanescent field of the trapped beam
is used to excite the dye molecules incorporated to the oligonucleotide
primers near the surface. Variations in the excitation
field are smoothed by the action of the dither mirror.

APEX reaction 

The majority of the β-thalassemias are caused by either a
single nucleotide substitution, or an oligonucleotide addition or
deletion that affects the coding region, or critical areas, for the
function of the β-globin gene (Cao et al., 1997). Primer extension
reaction conditions were optimized with a wild-type DNA
to achieve signals from all screened positions in the β-globin
gene.

The specificity of the assays was monitored by immobilized
self-elongating marker primers (Tables 1 and 2; Fig. 1, A, B,
and C) that are designed to form self-complementary homoduplexes
at the 3' end, permitting template-independent signals
for particular dye terminators. All primers on the array gave
signals as expected from the β-globin gene wild-type sequence
(Table 2; Fig. 1A). From the analyzed patient DNA samples, 1
was homozygous for the IVS-I-110 mutation (Fig. 1, A and B)
and 8 were heterozygous for the -87, Codon 5, Codon 6,
IVS-I-1, IVS-I-6 (Fig. 1C), IVS-II-1, Codon 39, and IVS-II-745
mutations, respectively.

The signal-to-noise ratio of primer extension is 40:1, measured
as the average fluorescence value for all oligonucleotides
on the array from three different experiments (Fig. 2). This
favors identification of heterozygous mutations. Covalent bonds
between the oligonucleotide and dye terminator allow the slides
to be stringently washed, minimizing the nonspecific signals
and reducing the background.

Our goal was to make the APEX assay as simple and robust
as possible; therefore, we abandoned the two-step assay described
earlier (Head et al., 1997; Pastinen et al., 1997). To
minimize manual operations with the DNA chips, we have used
single-step APEX reactions. Our experiments have shown that
both hybridization and template-dependent extension of arrayed
primers can be achieved in the same reaction step without removal
of unbound template. The same goal applies to the reaction
mix; it contains only absolutely necessary components:
template DNA, fluorescently labeled dideoxy nucleotides, and
high-specificity DNA polymerase in its commercial buffer.
Furthermore, the reaction conditions are relatively insensitive to
variations in the amount of dye terminator and polymerase.

Some target-dependent primers show self-extension signals,
e.g., Codon 39 and IVS-II-1 sense-strand primers gave signals
from the wild-type sequence (incorporation of C and G), as well
as from self-extension (incorporation of A), respectively, due
to formation of primer bridge structures (homodimers) similar
to the marker primers used, or to hairpin structures. The primers
on the array are designed according to the wild-type sequence
of the analyzed gene. Although the 3' end of the primers cannot
be varied, the internal part of the primer may be changed
by incorporating a mismatch to reduce primer self-complementarity
without seriously affecting the target-specific priming
ability. We are screening each mutation from both DNA
strands. APEX analysis of the opposite strand does not use the
complement of the problematic primer, and thus has a reasonable
probability of avoiding the self-priming problem. In addition,
there are no sites in the current assay containing such
problematic primers for both strands.
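One simple way to flag primers prone to homodimer self-extension is to look for a reverse-complement palindrome at the 3' end: two copies of such a primer can pair through that stretch, and the base just upstream of it then serves as template. The sketch below is a simplified screen under that assumption; it ignores hairpins, internal mismatches, and thermodynamics, and the first example primer is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_palindrome(primer, min_len=4):
    """Length of the longest 3'-terminal stretch equal to its own
    reverse complement, or 0 if none is at least `min_len` long."""
    for k in range(len(primer), min_len - 1, -1):
        tail = primer[-k:]
        if tail == reverse_complement(tail):
            return k
    return 0


def self_extension_base(primer, min_len=4):
    """Terminator expected from homodimer self-extension, or None."""
    k = three_prime_palindrome(primer, min_len)
    if k == 0 or k == len(primer):
        return None
    # The base just 5' of the paired stretch acts as the template.
    return primer[-k - 1].translate(COMPLEMENT)


marker_like = "CCTTAACGCCTATGACGTCA"        # hypothetical marker-style primer
ivs_i_6_sense = "GGTGGTGAGGCCCTGGGCAGGTTGG"  # IVS-I-6 sense primer, Table 1

print(three_prime_palindrome(marker_like), self_extension_base(marker_like))
print(three_prime_palindrome(ivs_i_6_sense), self_extension_base(ivs_i_6_sense))
```

The marker-style primer ends in the palindrome TGACGTCA preceded by A, so the screen predicts a self-extension signal for T, in line with the "X = A for self-extending T marker" note in Table 1.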

TIRF detection system and CCD imaging 

We have developed a detection system based on TIRF connected
to a CCD image reader (Axelrod et al., 1984; Stimpson
et al., 1995). Figure 3 demonstrates the basic idea of the excitation
scheme. A laser is deflected by a mirror and focused via
a lens through a launch prism and index matching oil onto a
glass slide. The intensity of the light field is nearly uniform
along the length of the slide. Above the surface of the slide the
intensity decreases exponentially, extending for approximately
one-quarter of the wavelength of light as an evanescent field,
which excites the bound fluorophores residing near the surface.
The emitted fluorescence is filtered to reject the background
scatter noise and adjacent dye signals and is then collected by a
CCD camera. Using one, two, or four dyes, four images are
obtained, one for each of the four dye-labeled ddNTPs. The
intensities of the imaged spots for each array element are compared
and the largest signal will identify the nucleotide in the
target sequence. When two signals are present at a location, a
heterozygous status is indicated.
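The calling rule described above (largest signal gives the base, two signals indicate a heterozygote) can be sketched as a small function. This is an illustration, not the authors' analysis macro: the intensities are assumed to be background-corrected, and the `het_ratio` and `min_signal` thresholds are hypothetical parameters chosen for the example.

```python
def call_spot(intensities, het_ratio=0.3, min_signal=0.0):
    """Call the base(s) at one array element.

    intensities: dict mapping 'A', 'C', 'G', 'T' to background-corrected
        fluorescence for the four terminator channels.
    het_ratio: hypothetical threshold; the second-strongest channel counts
        as a real signal if it reaches this fraction of the strongest.
    min_signal: hypothetical floor below which no call is made.
    Returns 'G' for a single signal, 'A/G' for two signals, or None.
    """
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    (top_base, top_value), (second_base, second_value) = ranked[0], ranked[1]
    if top_value <= min_signal:
        return None
    if second_value >= het_ratio * top_value:
        return "/".join(sorted((top_base, second_base)))
    return top_base


print(call_spot({"A": 5.0, "C": 3.0, "G": 120.0, "T": 2.0}))
print(call_spot({"A": 60.0, "C": 2.0, "G": 70.0, "T": 3.0}))
```

Because all four terminators compete for the same spot, the comparison is made within one array element rather than across spots, which is the robustness argument made in the introduction.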

The number of detectable fluorescent labels can be adjusted
according to assay requirements by the addition (or removal)
of corresponding lasers and filters. Two main criteria exist for
choice of dye-terminator conjugates. First, they must be spectrally
separable from each other and, second, they have to be
incorporated by DNA polymerase. In this report, the "four
labels-one reaction" scheme with ddNTPs conjugated to fluorescein,
Cy3, Texas Red, and Cy5 was used for β-thalassemia
APEX reactions.

The time required for the complete APEX analysis is less
than 4 hours, including PCR and sample preparation. However,
much of the target preparation and APEX reaction can be performed
in parallel, and the detection presents the major limiting
step in high-throughput analysis. The TIRF-CCD detector
is capable of reading one four-color slide per minute, presenting
an ultimate throughput of 60 slides per hour in the present
design. For a small number of sites, such as the 10 sites presented
here, the assay can be analyzed visually.

We propose that APEX offers a good platform for high-throughput
genetic testing. The approach can be applied to any
DNA target for analysis. The 40:1 signal-to-noise ratio enables
identification of heterozygous mutations with comfortable confidence
levels.



ACKNOWLEDGMENTS 

The authors wish to thank Heidi Saulep and Viljo Soo for
technical assistance. This work was supported by research
grants from European Community IC15-CT98-0309, Estonian
Science Foundation 2492, and Core Grant 0180518s98 from the
Estonian Ministry of Education. J.S. and J.T. were partly supported
by a NIST ATP award 70NANB5H1112.



REFERENCES 

AXELROD, D., BURGHARDT, T.P., and THOMPSON, N.L. (1984).
Total internal reflection fluorescence. Annu. Rev. Biophys. Bioeng.
13, 247-268.

CAO, A., SABA, L., GALANELLO, R., and ROSATELLI, M. (1997).
Molecular diagnosis and carrier screening for β thalassemia. J. Am.
Med. Assn. 278, 1273-1277.

CHEE, M., YANG, R., HUBBELL, E., BERNO, A., HUANG, X.C.,
STERN, D., WINKLER, J., LOCKHART, D.J., MORRIS, M.S., and
FODOR, S.P.A. (1996). Accessing genetic information with
high-density DNA arrays. Science 274, 610-614.

COTTON, R.G.H., EDKINS, E., and FORREST, S., eds. (1998).
Mutation Detection. A Practical Approach. Oxford University Press, Inc.,
New York.

CRONIN, M.T., FUCINI, R.V., KIM, S.M., MASINO, R.S., WESPI,
R.M., and MIYADA, C.G. (1996). Cystic fibrosis mutation detection
by hybridization to light-generated DNA probe arrays. Hum.
Mutat. 7, 244-255.

HACIA, J.G., BRODY, L.C., CHEE, M.S., FODOR, S.P.A., and
COLLINS, F.S. (1996). Detection of heterozygous mutations in
BRCA1 using high density oligonucleotide arrays and two-colour
fluorescence analysis. Nature Genet. 14, 441-447.

HEAD, S.R., ROGERS, Y.-H., PARIKH, K., LAN, G., ANDERSON,
S., GOELET, P., and BOYCE-JACINO, M.T. (1997). Nested genetic
bit analysis (N-GBA) for mutation detection in the p53 tumor
suppressor gene. Nucleic Acids Res. 25, 5065-5071.

HUISMAN, T.H.J., CARVER, M.F.H., and BAYSAL, E. (1998). A
Syllabus of Thalassemia Mutations. (ed. by Sickle Cell Anemia
Foundation, Augusta, GA).

LAMTURE, J.B., BEATTIE, K.L., BURKE, B.E., EGGERS, M.D.,
EHRLICH, D.J., FOWLER, R., HOLLIS, M.A., KOSICKI, B.B.,
REICH, R.K., SMITH, S.R., VARMA, R.S., and HOGAN, M.E.
(1994). Direct detection of nucleic acid hybridization on the surface
of a charge coupled device. Nucleic Acids Res. 22, 2121-2125.

LANDEGREN, U., NILSSON, M., and KWOK, P.-Y. (1998). Reading
bits of genetic information: methods for single-nucleotide
polymorphism analysis. Genome Res. 8, 769-776.

PASTINEN, T., KURG, A., METSPALU, A., PELTONEN, L., and
SYVÄNEN, A.-C. (1997). Minisequencing: a specific tool for DNA
analysis and diagnostics on oligonucleotide arrays. Genome Res. 7,
606-614.

SHUMAKER, J.M., METSPALU, A., and CASKEY, T. (1996).
Mutation detection by solid phase primer extension. Hum. Mutat. 7,
346-354.

SOUTHERN, E.M., MASKOS, U., and ELDER, J.K. (1992). Analyzing
and comparing nucleic acid sequences by hybridization to arrays
of oligonucleotides: evaluation using experimental models.
Genomics 13, 1008-1017.

STIMPSON, D.I., HOIJER, J.V., HSIEH, W.T., JOU, C., GORDON,
J., THERIAULT, T., GAMBLE, R., and BALDESCHWIELER, J.D.
(1995). Real-time detection of DNA hybridization and melting on
oligonucleotide arrays by using optical wave guides. Proc. Natl.
Acad. Sci. USA 92, 6379-6383.

Address reprint requests to:
Prof. Andres Metspalu
Inst. of Molecular and Cell Biology
Chair of Biotechnology
University of Tartu
23 Riia St., Tartu 51010
Estonia

E-mail: andres@ebc.ee

Received for publication September 24, 1999; accepted
December 28, 1999.



